A simple DOP model for constituency parsing of Italian sentences
Abstract
We present a simplified Data-Oriented Parsing (DOP) formalism for learning the constituency structure of Italian sentences. In our approach we try to simplify the original DOP methodology by constraining the number and type of fragments we extract from the training corpus. We provide some examples of the types of constructions that occur more often in the treebank, and quantify the performance of our grammar on the constituency parsing task.
Similar resources
A New DOP Model for Phrase-structure Parsing of Persian Sentences
In this paper we apply a recent approach to Data-Oriented Parsing (DOP), named Double-DOP, to Persian sentences. Like other DOP models, the Double-DOP parser uses syntactic fragments of arbitrary size from a treebank to analyse new sentences, but it extracts a restricted yet representative subset of fragments: only those that are encountered at least twice. The accurac...
Accurate Parsing with Compact Tree-Substitution Grammars: Double-DOP
We present a novel approach to Data-Oriented Parsing (DOP). Like other DOP models, our parser utilizes syntactic fragments of arbitrary size from a treebank to analyze new sentences, but, crucially, it uses only those which are encountered at least twice. This criterion allows us to work with a relatively small but representative set of fragments, which can be employed as the symbolic backbone ...
Evalita’09 Parsing Task: constituency parsers and the Penn format for Italian
The aim of the Evalita Parsing Task is to define and extend the state of the art for parsing Italian by encouraging the application of existing models and approaches. Therefore, as in the first edition, the Task includes two tracks, i.e. dependency and constituency. This second track is based on a development set in a format which is an adaptation for Italian of the Penn Treebank format, and ...
An improved joint model: POS tagging and dependency parsing
Dependency parsing is a form of syntactic parsing in natural language processing that automatically analyzes the dependency structure of sentences, producing a dependency graph for each input sentence. Part-Of-Speech (POS) tagging is a prerequisite for dependency parsing. Generally, dependency parsers perform POS tagging and dependency parsing in a pipeline mode. Unfortunately, in pipel...
Tree Kernels-based Discriminative Reranker for Italian Constituency Parsers
English. This paper aims at filling the gap between the accuracy of Italian and English constituency parsing: firstly, we adapt the Bllip parser, i.e., the most accurate constituency parser for English, also known as the Charniak parser, for Italian and train it on the Turin University Treebank (TUT). Secondly, we design a parse reranker based on Support Vector Machines using tree kernels, where ...